
    Rapid Visual Categorization is not Guided by Early Salience-Based Selection

    The current dominant visual processing paradigm in both human and machine research is the feedforward, layered hierarchy of neural-like processing elements. Within this paradigm, visual saliency is seen by many to have a specific role, namely that of early selection. Early selection is thought to enable very fast visual performance by limiting processing to only the most salient candidate portions of an image. This strategy has led to a plethora of saliency algorithms that have indeed improved processing time efficiency in machine algorithms, which in turn have strengthened the suggestion that human vision also employs a similar early selection strategy. However, at least one set of critical tests of this idea has never been performed with respect to the role of early selection in human vision. How would the best of the current saliency models perform on the stimuli used by the experimentalists who first provided evidence for this visual processing paradigm? Would the algorithms really provide correct candidate sub-images to enable fast categorization on those same images? Do humans really need this early selection for their impressive performance? Here, we report on a new series of tests of these questions whose results suggest that it is quite unlikely that such an early selection process has any role in human rapid visual categorization. Comment: 22 pages, 9 figures

    A Focus on Selection for Fixation

    A computational explanation of how visual attention, interpretation of visual stimuli, and eye movements combine to produce visual behavior seems elusive. Here, we focus on one component: how selection is accomplished for the next fixation. The popularity of saliency map models drives the inference that this is solved, but we argue otherwise. We provide arguments that a cluster of complementary conspicuity representations drives selection, modulated by task goals and history, leading to a hybrid process that encompasses early and late attentional selection. This design is also constrained by the architectural characteristics of the visual processing pathways. These elements combine into a new strategy for computing fixation targets, and a first simulation of its performance is presented. A sample video of this performance can be found by clicking on the "Supplementary Files" link under the "Article Tools" heading.

    Focusing on Selection for Fixation

    Building on our presentation at MODVIS 2015, we continue in our quest to discover a functional, computational explanation of the relationship among visual attention, interpretation of visual stimuli, and eye movements, and how these produce visual behavior. Here, we focus on one component: how selection is accomplished for the next fixation. The popularity of saliency map models drives the inference that this is solved; we suggested otherwise at MODVIS 2015. Here, we provide additional empirical and theoretical arguments. We then develop arguments that a cluster of complementary conspicuity representations drives selection, modulated by task goals and history, leading to a blended process that encompasses early, mid-level, and late attentional selection and reflects the differences between central and peripheral processes. This design is also constrained by the architectural characteristics of the visual processing pathways, specifically the boundary problem, as well as retinal photoreceptor distribution. These elements combine into a new strategy for computing fixation targets, and a first simulation of its performance is presented.

    SMILER: Consistent and Usable Saliency Model Implementations

    The Saliency Model Implementation Library for Experimental Research (SMILER) is a new software package which provides an open, standardized, and extensible framework for maintaining and executing computational saliency models. This work drastically reduces the human effort required to apply saliency algorithms to new tasks and datasets, while also ensuring consistency and procedural correctness for results and conclusions produced by different parties. At its launch SMILER already includes twenty-three saliency models (fourteen models based in MATLAB and nine supported through containerization), and the open design of SMILER encourages this number to grow with future contributions from the community. The project may be downloaded and contributed to through its GitHub page: https://github.com/tsotsoslab/smile
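The core idea described above is a uniform interface under which heterogeneous saliency models can be executed and compared consistently. The following is a minimal sketch of that idea in Python; the class and function names, and the toy center-surround model, are illustrative assumptions for this sketch and are not SMILER's actual API.

```python
# Sketch of a standardized saliency-model interface in the spirit of
# SMILER. All names here are hypothetical, not SMILER's real API.
import numpy as np


class SaliencyModel:
    """Common contract: every model maps an RGB image (H, W, 3)
    to a 2-D saliency map (H, W)."""
    name = "base"

    def compute(self, image: np.ndarray) -> np.ndarray:
        raise NotImplementedError


class CenterSurroundModel(SaliencyModel):
    """Toy model: conspicuity as absolute deviation from the
    global mean intensity."""
    name = "center-surround"

    def compute(self, image: np.ndarray) -> np.ndarray:
        gray = image.mean(axis=2)          # collapse color channels
        return np.abs(gray - gray.mean())  # deviation from global mean


def run_models(models, image):
    """Run each model through the same pipeline and min-max normalize
    every output to [0, 1], so downstream comparisons are consistent."""
    results = {}
    for model in models:
        smap = model.compute(image)
        rng = smap.max() - smap.min()
        results[model.name] = (smap - smap.min()) / rng if rng > 0 else np.zeros_like(smap)
    return results
```

The design point this sketch illustrates is the one the abstract emphasizes: once every model honors the same input/output contract and normalization step, adding a new model or dataset requires no per-model glue code.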

    An Evaluation of Saliency and Its Limits

    The field of computational saliency modelling has its origins in psychophysical studies of visual search and low-level attention, but over the years has heavily shifted focus to performance-based model development and benchmarking. This dissertation examines the current state of saliency research from the perspective of its relationship to human visual attention, and presents research along three different but complementary avenues: a critical examination of the metrics used to measure saliency model performance, a software library intended to facilitate the exploration of saliency model applications outside of standard benchmarks, and a novel model of fixation control that extends fixation prediction beyond a static saliency map to an explicit prediction of an ordered sequence of saccades. The examination of metrics provides a more direct window into algorithm spatial bias than competing methods, and presents evidence that spatial considerations cannot be completely isolated from stimulus appearance when accounting for human fixation locations. Experimentation over psychophysical stimuli reveals that many of the most recent models, all of which achieve high benchmark performance for fixation prediction, fail to identify salient targets in basic feature search, more complex singleton search, and search asymmetries, suggesting an overemphasis on the specific performance benchmarks that are widely used in saliency modelling research and a need for more diverse evaluation. Further experiments are performed to test how different saliency algorithms predict fixations across space and time, finding a consistent spatiotemporal pattern of saliency prediction across almost all tested algorithms. The fixation control model outperforms competing methods at saccade sequence prediction according to a number of trajectory-based metrics, and produces qualitatively more human-like fixation traces than those sampled from static maps.
The results of these studies together suggest that the role of saliency should not be viewed in isolation, but rather as a component of a larger visual attention system, and this work provides a number of tools and techniques that will facilitate further understanding of visual attention.
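The fixation-prediction metrics discussed above can be made concrete with one widely used example from the saliency benchmarking literature: Normalized Scanpath Saliency (NSS), which z-scores a saliency map and averages the normalized values at human fixation locations. The sketch below is a standard textbook formulation of NSS, not code from the dissertation.

```python
# Normalized Scanpath Saliency (NSS): a common fixation-prediction
# metric. Higher values mean fixations land on above-average saliency.
import numpy as np


def nss(saliency_map: np.ndarray, fixation_mask: np.ndarray) -> float:
    """Z-score the saliency map (zero mean, unit variance), then
    average the normalized values at fixated pixels.

    saliency_map:  2-D array of saliency values.
    fixation_mask: 2-D binary array, 1 where a human fixated.
    """
    z = (saliency_map - saliency_map.mean()) / saliency_map.std()
    return float(z[fixation_mask.astype(bool)].mean())
```

A chance-level map scores near 0 under NSS, while a map that concentrates mass on fixated locations scores well above 0, which is what makes it useful for the kind of spatial-bias analysis the dissertation describes.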